All Questions

0 votes · 0 answers · 23 views

How to use cross validation to select/evaluate model with probability score as the output?

Initially I was evaluating my models using cross_val with out-of-the-box metrics such as precision, recall, F1 score, etc., or with my own metrics defined in ...
asked by szheng
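A minimal sketch of one way to do this, assuming scikit-learn and a synthetic dataset: built-in scorers such as neg_log_loss and roc_auc consume the model's predict_proba output directly, so cross_val_score can evaluate probability scores without converting them to hard labels.

```python
# Minimal sketch (assumptions: scikit-learn, a synthetic dataset): score a
# probabilistic classifier in cross-validation with metrics that use
# predict_proba output rather than hard class labels.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)
clf = LogisticRegression(max_iter=1000)

# 'neg_log_loss' and 'roc_auc' are built-in probability-based scorers.
log_loss_scores = cross_val_score(clf, X, y, cv=5, scoring="neg_log_loss")
auc_scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(log_loss_scores.mean(), auc_scores.mean())
```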
0 votes · 0 answers · 87 views

XGBoost Classifier Evaluation Confusion on New Dataset Despite High Cross-Validation Scores

I have built an XGBoost classifier model with 90 features, trained on a dataset containing 760k samples. I took great care to separate the labels from the features in both the training and testing ...
asked by oklen
0 votes · 3 answers · 877 views

For cross validation should I use training set, or whole dataset?

I'm new to data science and I have a problem understanding what dataset to use when using cross validation for model evaluation. Let's say I have two models: LogisticRegression and ...
asked by Michał Jurzak
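A minimal sketch of the usual arrangement, assuming scikit-learn and a synthetic dataset: cross-validation runs on the training portion only, and the held-out test set is used once for a final check of the selected model.

```python
# Minimal sketch (assumptions: scikit-learn, a synthetic dataset): run
# cross-validation on the training split only and keep a held-out test set
# for a single final evaluation of the chosen model.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)  # model-selection signal
model.fit(X_train, y_train)
test_score = model.score(X_test, y_test)                    # final estimate on unseen data
print(cv_scores.mean(), test_score)
```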
0 votes · 1 answer · 747 views

Can I use GridSearchCV.best_score_ for evaluation of model performance?

The scikit-learn page on Grid Search says: Model selection by evaluating various parameter settings can be seen as a way to use the labeled data to “train” the parameters of the grid. When evaluating the ...
asked by Charlie
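A minimal sketch, assuming scikit-learn and a synthetic dataset, of how best_score_ (the cross-validated score of the winning parameters on the search data) can be contrasted with a score on a held-out test set, which is typically the less optimistic estimate.

```python
# Minimal sketch (assumptions: scikit-learn, a synthetic dataset):
# GridSearchCV.best_score_ is the CV score of the best parameter setting on
# the data the search saw; a held-out test set gives a separate estimate.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

search = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)
print("best_score_ (CV on training data):", search.best_score_)
print("held-out test score:", search.score(X_test, y_test))
```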
0 votes · 1 answer · 116 views

How do I know If my regression model is underfitting?

How do we evaluate the performance of a regression model with a certain RMSE when no domain-knowledge performance benchmark is available? Maybe MAPE is one way of comparing the performance of my ...
asked by Mehmet Deniz
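One hedged way to sanity-check an RMSE without a domain benchmark is to compare it against a naive baseline; a minimal sketch, assuming scikit-learn (0.24+ for mean_absolute_percentage_error) and a synthetic dataset:

```python
# Minimal sketch (assumptions: scikit-learn >= 0.24, a synthetic dataset):
# compare a model's RMSE/MAPE against a mean-predicting baseline; barely
# beating the baseline is one symptom of underfitting.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.dummy import DummyRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, noise=10.0, random_state=0)
y = y - y.min() + 1.0  # shift targets to be positive so MAPE stays well-defined
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("mean baseline", DummyRegressor()), ("linear model", LinearRegression())]:
    pred = model.fit(X_train, y_train).predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    mape = mean_absolute_percentage_error(y_test, pred)
    print(f"{name}: RMSE={rmse:.2f}, MAPE={mape:.3f}")
```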
1 vote · 0 answers · 306 views

Grouped stratified train-val-test split for a multilabel dataset

So this is indeed nontrivial. I was wondering if there is a fast heuristic algorithm for performing a grouped, stratified split of a multilabel dataset. Stratification is usually performed to ...
asked by jasperhyp
1 vote · 0 answers · 633 views

How does exactly eval_set and RandomizedSearchCV work for LightGBM?

How does RandomizedSearchCV form its validation sets when I have also defined an evaluation set for LGBM? Are they formed from the train set I provided, or how does the evaluation set come into the validation? ...
asked by morqueatsz
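A minimal sketch, assuming lightgbm and scikit-learn: RandomizedSearchCV carves its validation folds out of whatever data is passed to search.fit(), while an eval_set forwarded as a fit parameter stays a fixed, separate holdout that LightGBM uses for per-iteration evaluation, not for scoring the hyperparameter candidates.

```python
# Minimal sketch (assumptions: lightgbm + scikit-learn installed): the search's
# CV folds come from the data given to search.fit(); the eval_set passed through
# as a fit parameter is a separate, fixed holdout used by LightGBM internally.
import lightgbm as lgb
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

search = RandomizedSearchCV(
    lgb.LGBMClassifier(),
    {"num_leaves": [15, 31, 63], "learning_rate": [0.05, 0.1]},
    n_iter=4,
    cv=3,            # folds are carved out of X_train/y_train internally
    random_state=0,
)
# eval_set is forwarded to LGBMClassifier.fit for each candidate/fold.
search.fit(X_train, y_train, eval_set=[(X_val, y_val)])
print(search.best_params_, search.best_score_)
```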
0 votes · 0 answers · 52 views

How to evaluate model accuracy at tail of empirical distribution?

I am fitting a nonlinear regression on a stationary dependent variable and I want to precisely forecast extreme values of this variable. So when my model predicts extreme values I want them to be highly ...
asked by Łukasz Czop
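A minimal sketch of one way to look at tail accuracy, assuming only NumPy and synthetic predictions: restrict the error metric to the samples whose true value lies beyond a high quantile.

```python
# Minimal sketch (assumptions: NumPy only, synthetic predictions): evaluate
# error separately on the tail of the empirical distribution by restricting
# the metric to samples whose true value exceeds a high quantile.
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.standard_t(df=3, size=5000)            # heavy-tailed target
y_pred = y_true + rng.normal(scale=0.5, size=5000)  # imperfect forecasts

tail = y_true > np.quantile(y_true, 0.95)           # top 5% of true values
rmse_all = np.sqrt(np.mean((y_true - y_pred) ** 2))
rmse_tail = np.sqrt(np.mean((y_true[tail] - y_pred[tail]) ** 2))
print(rmse_all, rmse_tail)
```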
0 votes · 2 answers · 320 views

I am attempting to implement k-fold cross validation in Python 3. What is the best way to implement this? Is it preferable to use pandas or NumPy? [closed]

I am attempting to create a script to implement cross validation on my data. However, the splits cannot take records at random, so that training and testing can be done on equal data splits for each ...
asked by AGX301
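A minimal hand-rolled sketch, assuming NumPy and pandas with a small synthetic DataFrame: shuffle the row indices, split them into k folds, and let each fold serve as the test set once (scikit-learn's KFold does the same bookkeeping for you).

```python
# Minimal sketch (assumptions: NumPy + pandas, a synthetic DataFrame): a
# hand-rolled k-fold split over row indices; each fold becomes the test set
# once while the remaining folds form the training set.
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": np.arange(100), "y": np.arange(100) % 2})
k = 5
rng = np.random.default_rng(0)
folds = np.array_split(rng.permutation(len(df)), k)

for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
    train, test = df.iloc[train_idx], df.iloc[test_idx]
    print(f"fold {i}: {len(train)} train rows, {len(test)} test rows")
```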
1 vote · 1 answer · 725 views

n_jobs=-1 or n_jobs=1?

I am confused about the n_jobs parameter used in some models and for CV. I know it is used for parallel computing and specifies the number of processors to use. So if I ...
asked by spectre (2,203 reputation)
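A minimal sketch, assuming scikit-learn and a synthetic dataset: n_jobs=1 runs the folds sequentially in one process, while n_jobs=-1 distributes them over all available cores, which you can see by timing the same cross-validation both ways.

```python
# Minimal sketch (assumptions: scikit-learn, a synthetic dataset): compare the
# wall-clock time of the same cross-validation run with and without
# parallelism over CPU cores.
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=30, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)

for n_jobs in (1, -1):
    start = time.perf_counter()
    cross_val_score(clf, X, y, cv=5, n_jobs=n_jobs)  # folds run in parallel when n_jobs=-1
    print(f"n_jobs={n_jobs}: {time.perf_counter() - start:.1f}s")
```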
1 vote · 0 answers · 126 views

Imbalanced dataset, finding the statistical significance of a Matthews Correlation Coefficient (MCC) in binary classification (what is a good MCC)?

I have a very imbalanced dataset. Thus, I am using MCC to evaluate the performance of various ML algorithms. It appears that the literature is entirely lacking in ways to evaluate how good an MCC score is. ...
asked by Prospero
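One hedged way to attach a significance level to an MCC value is a label-permutation test; a minimal sketch assuming scikit-learn, NumPy, and synthetic imbalanced labels:

```python
# Minimal sketch (assumptions: scikit-learn + NumPy, synthetic labels): a
# permutation test comparing the observed MCC against MCC values obtained by
# shuffling the true labels, giving an empirical p-value for "better than
# chance" on an imbalanced problem.
import numpy as np
from sklearn.metrics import matthews_corrcoef

rng = np.random.default_rng(0)
y_true = rng.binomial(1, 0.05, size=2000)                        # ~5% positives
y_pred = np.where(rng.random(2000) < 0.8, y_true, 1 - y_true)    # imperfect predictor

observed = matthews_corrcoef(y_true, y_pred)
null = np.array([
    matthews_corrcoef(rng.permutation(y_true), y_pred) for _ in range(1000)
])
p_value = (np.sum(null >= observed) + 1) / (len(null) + 1)
print(observed, p_value)
```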
0 votes · 1 answer · 2k views

Machine Learning validation data returns 100% accuracy [closed]

I'm testing a machine learning model, and the validation data returns 100% correct answers. Is it overfitting, or does the model work extremely well? Do I need to continue training on more data? I'...
asked by MXK (184 reputation)
0 votes · 1 answer · 379 views

Difference between validation and prediction

As a follow-up to "Validate via predict() or via fit()?", I wonder about the difference between validation and prediction. To keep it simple, I will refer to train, ...
asked by Ben (570 reputation)
1 vote · 1 answer · 115 views

Validity of cross-validation for model performance estimation

When applying cross-validation for estimating the performance of a predictive model, the reported performance is usually the average performance over all the validation folds. As during this procedure,...
asked by C.S.
5 votes · 3 answers · 7k views

In k-fold cross-validation, why do we compute the mean of the metric over the folds?

In k-fold cross-validation, the "correct" scheme seems to be to compute the metric (say, the accuracy) for each fold and then return the mean as the final metric. Source: https://scikit-learn.org/stable/...
asked by Alexis Pister
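A minimal sketch, assuming scikit-learn and a synthetic dataset: each fold produces one score, and the mean (with the standard deviation as a spread indicator) summarizes them.

```python
# Minimal sketch (assumptions: scikit-learn, a synthetic dataset): k-fold CV
# yields one score per fold; the mean summarizes them and the standard
# deviation indicates how much the estimate varies across folds.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5, scoring="accuracy")
print("per-fold accuracies:", scores)   # one score per validation fold
print("mean:", scores.mean(), "std:", scores.std())
```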
